Industrial Computing Systems: A Case Study of Fault Tolerance Analysis

Author

  • Andrey A. Shchurov
Abstract

Fault tolerance is a key factor in the design of industrial computing systems. In practice, however, these systems, like every commercial product, are subject to tight financial constraints, and they have to remain operational for as long as possible to stay commercially attractive. This work analyzes the instantaneous failure rate of such systems at the end of their lifetime. On the basis of this analysis, we identify the effect of a critical increase in the system failure rate and the basic condition under which it occurs. We then determine a maintenance schedule that can help to avoid this effect and to extend the system lifetime in fault-tolerant mode.

Keywords: reliable computing system, fault tolerance, maintenance scheduling.


Similar resources

An approach to fault detection and correction in design of systems using of Turbo codes

We present an approach to the design of fault-tolerant computing systems. In this paper, a technique is employed that enables the combination of several codes in order to obtain flexibility in the design of error-correcting codes. Code-combining techniques are very effective, and turbo codes are one such code. The Algorithm-based fault tolerance techniques that to detect errors rely on the c...


Providing Real-Time Applications With Graceful Degradation of QoS and Fault Tolerance According to (m, k)-Firm Model

The (m,k)-firm model has recently drawn a lot of attention. It provides a flexible real-time system with graceful degradation of the QoS, thus achieving fault tolerance in case of system overload. In this paper we first review the existing work on the use of the (m,k)-firm model for QoS and fault tolerance management. Then we focus on DBP algorithm as it presents the inte...


Improving the palbimm scheduling algorithm for fault tolerance in cloud computing

Cloud computing is the latest technology that involves distributed computation over the Internet. It meets the needs of users through sharing resources and using virtual technology. The workflow user applications refer to a set of tasks to be processed within the cloud environment. Scheduling algorithms have a lot to do with the efficiency of cloud computing environments through selection of su...


An approach to protecting data processing operations in computing systems using convolutional codes

We present a framework for algorithm-based fault tolerance (ABFT) methods in the design of fault-tolerant computing systems. The ABFT error detection technique relies on the comparison of parity values computed in two ways. The parallel processing of input parity values produces output parity values comparable with parity values regenerated from the original processed outputs. Number data proc...


Improvement of the Reliability of Automatic Manufacture Systems by Using FTA Technique

In recent years, many manufacturing industries have tended to adopt automatic manufacturing systems to improve their efficiency. The expansion of automatic systems and their increasing complexity make it more necessary than ever to study proper functional quality and to use reliable equipment in such systems. In this direction, the technique of fault tree analysis (FTA), along w...



Journal:
  • CoRR

Volume abs/1503.08715

Publication year 2015